ABSTRACT

We have used the Largest Cluster Statistics and the Average Filamentarity to quantify, respectively, the connectivity and
the shapes of the patterns seen in the galaxy distribution in two volume limited subsamples extracted from the equatorial
strips of the Sloan Digital Sky Survey (SDSS) Data Release One (DR1). The data were projected onto the equatorial plane
and analyzed in two dimensions (2D). Comparing the results with Poisson point distributions at various levels of smoothing,
we find evidence for a network-like topology with filaments being the dominant patterns in the galaxy distribution. With
increasing smoothing, a transition from many individual filamentary structures to an interconnected network is found to
occur at a filling factor of 0.5 to 0.6. We have tested the possibility that the connectivity and the morphology of the patterns
in the galaxy distribution may be luminosity dependent and find significant evidence for a luminosity-morphology relation, the
brighter galaxies exhibiting lower levels of connectivity and filamentarity compared to the fainter ones. Using a statistical
technique, Shuffle, we show that the filamentarity in both the SDSS strips is statistically significant up to $80\,h^{-1}$ Mpc but
not beyond. Larger filaments, though identified, are not statistically significant. Our findings reaffirm earlier work
establishing the filaments to be the largest known statistically significant coherent structures in the universe.
1 INTRODUCTION

Quantifying the clustering pattern observed in the galaxy distribution is one of the central themes in modern cosmology. 
A striking feature visible in all redshift surveys is that the galaxies appear to be distributed along filaments which are 
interconnected and form a network, often referred to as the "Cosmic Web". In this paper we quantify the inter-connectivity 
and the shapes of the patterns seen in the galaxy distribution in the SDSS (York et al. 2000).

The percolation analysis (e.g. Shandarin & Zeldovich 1983; Einasto et al. 1984) and the genus statistics (e.g. Gott, Dickinson
& Melott 1986) are some of the earliest statistics introduced to quantify the topology of the galaxy distribution. Shandarin &
Yess (1998) used the Largest Cluster Statistics (LCS), a percolation technique developed for point-wise distributions, to
analyze the connectivity of structures in the Las Campanas Redshift Survey (LCRS; Shectman et al. 1996). The thickness of
the six LCRS slices is very small compared to the two other dimensions, and the analysis was carried out on two dimensional
(2D) projections. The LCS analysis focuses on the growth of the largest cluster with increasing smoothing. A growth faster
than that of a random Poisson point distribution indicates a network topology, whereas a slower growth indicates a meatball
topology. The LCS analysis shows the presence of a high level of inter-connectivity, indicative of a network-like structure in
the distribution of the LCRS galaxies.

The Minkowski functionals have been suggested as a novel tool to study the morphology of structures in the universe
(e.g. Mecke, Buchert & Wagner 1994; Schmalzing & Buchert 1997). Ratios of the Minkowski functionals can be used to define
shape diagnostics, 'Shapefinders', which faithfully quantify the shapes of both simple and topologically complex objects
(Sahni, Sathyaprakash & Shandarin 1998). Bharadwaj et al. (2000) defined the Shapefinder statistics in 2D, and used them
to demonstrate that the galaxy distribution in the LCRS exhibits a high degree of filamentarity compared to a random
Poisson distribution. In a later paper, Bharadwaj et al. (2004) used Shapefinders in conjunction with a statistical technique
called Shuffle (Bhavsar & Ling 1988) to determine the maximum length-scale at which the filaments observed in the LCRS
are statistically significant. They found that the largest length-scale at which filaments are statistically significant is between
70 and $80\,h^{-1}$ Mpc for the LCRS $-3^\circ$ slice. Filamentary features longer than $80\,h^{-1}$ Mpc, though identified, are not
statistically significant. Such features arise from chance alignments of galaxies. Further, for the five other LCRS slices,
filaments of lengths $50\,h^{-1}$ Mpc to $70\,h^{-1}$ Mpc were found to be statistically significant, but not beyond.

The ability to produce the filamentary features observed in redshift surveys is an important test of any model for structure
formation. Bharadwaj & Pandey (2004) have used N-body simulations of the $\Lambda$CDM model with a featureless, scale invariant
primordial power spectrum and random initial phases to investigate whether the filamentarity predicted by this model is
consistent with that detected in the LCRS. They find that the filamentarity in an unbiased $\Lambda$CDM model is less than that in
the LCRS. Introducing a bias $b = 1.15$, the model is in rough consistency with the data, and a large bias ($b = 1.5$), which
enhances filamentarity at small scales and suppresses it at large scales, is ruled out. The filamentarity is very sensitive to the
bias, and it may be possible to use a quantitative analysis of filamentarity to determine the bias parameter.

The Sloan Digital Sky Survey (SDSS) offers the possibility of studying coherent structures in the galaxy distribution over
length-scales which are substantially larger than possible with earlier surveys like the LCRS. Among the total sky coverage of
the publicly available SDSS Data Release One (DR1) are two strips which are centered along the celestial equator
($\delta = 0^\circ$), one spanning 65° and the other 91° in r.a., with their thickness varying within $|\delta| < 2.5^\circ$ in dec. We use
galaxies distributed over the redshift range $0.02 < z < 0.2$. The thickness of the three dimensional regions corresponding to
these strips is much smaller than the two other dimensions, and we can project the galaxy distribution onto the equatorial
plane ($\delta = 0$) without smearing out the large-scale patterns. In this paper we quantify the inter-connectivity and the shapes
of the patterns in the resulting 2D galaxy distribution using the same techniques which were earlier applied to the LCRS.

We restrict our analysis to volume limited subsamples described in Section 2 of this paper. The dense sampling of the
SDSS allows us to divide the galaxies into two classes based on their luminosities and test if the inter-connectivity and the
shapes of the patterns seen in the galaxy distribution are different for the fainter and the brighter galaxies. The large volume
of the two SDSS equatorial strips offers further advantages over the LCRS; we discuss these in the appropriate parts of the
paper. In this paper we have only considered the average properties of the structures, and we have not addressed questions
pertaining to individual filaments, nor have we compared our results to N-body simulations. We propose to address these
issues in future work.

Hoyle et al. (2002) have studied the 2D topology of the SDSS using the genus statistics. In addition to the two equatorial
strips used in this paper, they have also used a strip at high declinations. They find that their results are consistent with
those from $\Lambda$CDM N-body simulations and also with those from a similar analysis of the 2dFGRS (Hoyle, Vogeley & Gott
2002). They have also divided the galaxies by colour and separately analyzed them to find that the distribution of the red
galaxies shows a shift to a meatball topology relative to the blue galaxies and the full sample, reflecting the fact that red
galaxies are distributed in more compact, high density regions. Hikage et al. (2003) have used Minkowski Functionals to study
the morphology of the patterns in the galaxy distribution in a preliminary sample from the SDSS. Doroshkevich et al. (2004)
have used the Minimal Spanning Tree to identify sheets and filaments in the SDSS DR1. In both these works the authors
find their results to be consistent with $\Lambda$CDM N-body simulations. Einasto et al. (2003) have studied the super-cluster void
network in the SDSS. Sheth (2003) has used a technique, SURFGEN (Sheth et al. 2003), to study the geometry, topology
and morphology of the superclusters in mock SDSS catalogues and finds that filamentarity is the dominant morphology
of the large superclusters. Shandarin, Sheth & Sahni (2004) studied the large-scale network in the dark matter density field
in N-body simulations from the VIRGO consortium using SURFGEN and noticed that the individual superclusters and voids
exhibit a significant amount of substructure as indicated by their genus values.

Basilakos, Plionis & Rowan-Robinson (2001) and Kolokotronis, Basilakos & Plionis (2002) have studied the super-cluster
void network in the PSCz and the Abell/ACO cluster catalogue respectively, finding filamentarity to be the dominant feature.
Pimbblet, Drinkwater & Hawkrigg (2004) performed an analysis of the frequency and distribution of intercluster filaments in
the 2dFGRS with a filament classification scheme based on their visual morphology and reported that massive clusters have
a larger number of filaments.

Traditionally, correlation functions have been used to quantify the statistical properties of the galaxy distribution, with the
two-point correlation function and its Fourier transform, the power spectrum, receiving most of the attention. For the SDSS
this includes analysis of the power spectrum (e.g. Szalay et al. 2003; Tegmark et al. 2002; Tegmark et al. 2004a; Dodelson et al.
2002), the two-point correlation function (e.g. Zehavi et al. 2002; Connolly et al. 2002; Infante et al. 2002) and the higher
order moments (e.g. Szapudi et al. 2002).

We have used a $\Lambda$CDM cosmological model with $\Omega_{m0} = 0.3$, $\Omega_{\Lambda 0} = 0.7$ and $h = 1$ throughout.

We next present an outline of the paper. In Section 2 we describe the data and the method of analysis. The results are 
presented in Section 3, and finally we present discussion and conclusions in Section 4. 
Figure 1. This shows the galaxy distribution in two of the volume limited subsamples which we have analyzed. The other subsamples,
described in Section 2.1 and listed in Table I, were all extracted from the two subsamples shown here and the visual appearance of these
subsamples is very similar.

Figure 2. This shows a single realization of the random galaxy distribution in two of the SDSS strips that we have analyzed.



2 DATA AND METHOD OF ANALYSIS 
2.1 SDSS and the data 

The SDSS is a wide-field photometric and spectroscopic survey of the high galactic latitude sky visible from the Northern
hemisphere. It uses a dedicated 2.5 m telescope at Apache Point Observatory in New Mexico. The primary goals of the
SDSS are to image 10,000 square degrees of the Northern Galactic Cap and three ~200 square degree stripes in the Southern
Galactic Cap in five wavebands, namely u, g, r, i and z, and to determine spectroscopic redshifts of approximately $10^6$ galaxies
and $10^5$ quasars. A high level overview of the SDSS is provided by York et al. (2000). The details of the software and data
products of the Early Data Release (EDR) are described in Stoughton et al. (2002). The details and the updates of the data for
the Data Release One (DR1) and Second Data Release (DR2) can be found in Abazajian et al. (2003) and Abazajian et al.
(2004) respectively. Other technical details of the SDSS are the descriptions of the photometric camera (Gunn et al. 1998),
the photometric system (Fukugita et al. 1996; Smith et al. 2002), photometric monitor (Hogg et al. 2001) and photometric
analysis (Lupton et al. 2002). There are other important articles which cover astrometric calibrations (Pier et al. 2003),
selection of spectroscopic samples (Eisenstein et al. 2001; Strauss et al. 2002) and spectroscopic tiling (Blanton et al. 2003a).



Our present analysis is based on SDSS DR1 galaxy redshift data. In this paper we analyze two equatorial strips (celestial
equator), one in the Northern Galactic Cap (NGP) which covers the region $145^\circ < \alpha < 236^\circ$, and another in the Southern
Galactic Cap (SGP) covering $351^\circ < \alpha < 56^\circ$, both with varying thickness in the range $-2.5^\circ < \delta < 2.5^\circ$. Together these
contain 38,838 galaxies with redshifts in the range $0.02 < z < 0.2$, with the selection criterion that the extinction corrected
Petrosian r band magnitude is $r_p < 17.77$.

For the current purpose we selected only the galaxies lying within $-1^\circ < \delta < 1^\circ$ as both the equatorial strips have
complete coverage in this declination range. We construct volume limited samples over the redshift range $0.08 < z < 0.2$
by restricting the extinction corrected Petrosian r band apparent magnitude to the range $14.5 < m_r < 17.5$ and the absolute
magnitude to the range $-22.6 < M_r < -21.6$. This reduces the number of galaxies, but also offers some advantages as the
radial selection function is approximately uniform, so the variation in number density of galaxies is caused by clustering only.
The above redshift limit was chosen so as to get a good compromise between the number of galaxies and the volume of the
sample. We further divide the absolute magnitude range into two equal parts in order to produce separate volume limited
samples of the fainter and brighter galaxies. We finally have 5315 galaxies distributed in two wedges, spanning 91° (NGP)
and 65° (SGP) in r.a., both with thickness 2° centered along the equatorial plane, extending from $235\,h^{-1}$ Mpc to
$571\,h^{-1}$ Mpc comoving in the radial direction.

Table I

Sample                  Abs. Mag                 No. of galaxies
NGP                     -22.6 < M_r < -21.6      3297
NGP faint               -22.1 < M_r < -21.6      2221
NGP bright              -22.6 < M_r < -22.1      1076
NGP uniform thickness   -22.6 < M_r < -21.6      1936
SGP                     -22.6 < M_r < -21.6      2018
SGP faint               -22.1 < M_r < -21.6      1206
SGP bright              -22.6 < M_r < -22.1      812
SGP uniform thickness   -22.6 < M_r < -21.6      1096

The analysis using Shuffle requires us to cut the entire survey area into squares and shuffle them around. The thickness
of the wedges described above increases with radial distance, varying from $8.2\,h^{-1}$ Mpc to $20\,h^{-1}$ Mpc. For the Shuffle
analysis we have used subsamples of uniform thickness $8.2\,h^{-1}$ Mpc extracted from the wedges described above.
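
The radial limits and wedge thicknesses quoted above can be cross-checked numerically. The following is a minimal sketch, assuming a flat $\Lambda$CDM model with the parameters adopted in this paper and $H_0 = 100\,h$ km/s/Mpc; the function names `comoving_distance` and `wedge_thickness` are illustrative and are not part of any survey software.

```python
import math

C_OVER_H0 = 2997.92458  # c / H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc


def comoving_distance(z, omega_m=0.3, omega_l=0.7, steps=1000):
    """Line-of-sight comoving distance (h^-1 Mpc) in a flat LCDM model."""
    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)

    # Composite Simpson rule on [0, z]; `steps` must be even.
    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return C_OVER_H0 * total * h / 3.0


def wedge_thickness(r, dec_width_deg=2.0):
    """Thickness (h^-1 Mpc) at distance r of a wedge with the given declination width."""
    return r * math.radians(dec_width_deg)


r_near = comoving_distance(0.08)
r_far = comoving_distance(0.2)
print(r_near, r_far)
print(wedge_thickness(r_near), wedge_thickness(r_far))
```

Under these assumptions the two distances should come out close to the 235 and 571 $h^{-1}$ Mpc limits, and the thicknesses close to 8.2 and 20 $h^{-1}$ Mpc.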

Table I summarizes some properties of all the subsamples which we have used in our analysis. The galaxy positions in
all the subsamples were projected onto the equatorial plane to obtain the 2D galaxy distributions (Figure 1) which we have
analyzed in the rest of the paper. A visual inspection of Figure 1 reveals the presence of large-scale coherent patterns, namely
the interconnected network of filaments and voids, which we now proceed to quantify. For both the NGP and SGP subsamples
(Table I), we have generated 9 random realizations (Figure 2) which contain exactly the same number of galaxies randomly
distributed over the same volume as the actual data.
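
A random realization of this kind might be generated as in the sketch below. It assumes a wedge of constant angular thickness, so that a uniform density in volume corresponds to a radial distribution $p(r) \propto r^2$, and it neglects the $\cos\delta$ factor in the projection; the function name `random_wedge` and the parameter `radial_power` are hypothetical (setting `radial_power=1` would give a uniform surface density, as appropriate for a slab of uniform thickness).

```python
import numpy as np


def random_wedge(n, r_min, r_max, ra_min_deg, ra_max_deg, radial_power=2, seed=None):
    """Draw n random points in a wedge and return their projected (x, y) positions.

    Radii follow p(r) ~ r**radial_power between r_min and r_max (inverse-CDF
    sampling); right ascension is uniform between the two limits.
    """
    rng = np.random.default_rng(seed)
    k = radial_power + 1
    u = rng.random(n)
    r = (r_min ** k + u * (r_max ** k - r_min ** k)) ** (1.0 / k)
    ra = np.radians(rng.uniform(ra_min_deg, ra_max_deg, n))
    return np.column_stack((r * np.cos(ra), r * np.sin(ra)))


# Same number of points and same limits as the NGP subsample in Table I.
points = random_wedge(3297, 235.0, 571.0, 145.0, 236.0, seed=1)
print(points.shape)
```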



2.2 Method of Analysis 

For each of the subsamples described in the previous subsection, the 2D galaxy distribution was embedded in a
$1\,h^{-1}$ Mpc $\times$ $1\,h^{-1}$ Mpc 2D rectangular grid. Grid cells having a galaxy within them are termed filled and were assigned
the value 1, whereas cells with no galaxy are termed empty and were given the value 0. The grid cells which are beyond the
boundaries of the survey were assigned a negative value in order to distinguish them from the empty cells within the survey
area. The net result is that the galaxy distribution is now represented through a distribution of 1s and 0s located on a 2D grid.
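
The embedding step can be illustrated with a short sketch. The function `embed_on_grid` and the callable `inside_survey` (which decides whether a cell centre lies within the survey boundary) are hypothetical names introduced for illustration; the value -1 for cells outside the survey is one possible choice of the negative value mentioned above.

```python
import numpy as np


def embed_on_grid(x, y, inside_survey, x0, y0, nx, ny, cell=1.0):
    """Return an (ny, nx) grid: 1 = filled, 0 = empty, -1 = outside the survey."""
    # Cell-centre coordinates, used only to decide which cells lie in the survey.
    xc = x0 + (np.arange(nx) + 0.5) * cell
    yc = y0 + (np.arange(ny) + 0.5) * cell
    xx, yy = np.meshgrid(xc, yc)

    grid = np.zeros((ny, nx), dtype=np.int8)   # 0: empty cell inside the survey
    grid[~inside_survey(xx, yy)] = -1          # negative: beyond the survey boundary

    # Index of the cell containing each galaxy; galaxies off the grid are ignored.
    ix = np.floor((np.asarray(x) - x0) / cell).astype(int)
    iy = np.floor((np.asarray(y) - y0) / cell).astype(int)
    on_grid = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)
    grid[iy[on_grid], ix[on_grid]] = 1         # 1: cell contains at least one galaxy
    return grid


# Toy example: two galaxies on a 4 x 2 grid whose last column lies outside the survey.
demo = embed_on_grid([0.5, 2.2], [0.5, 1.7], lambda xx, yy: xx < 3.0, 0.0, 0.0, 4, 2)
print(demo)
```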

The next step is to use an objective criterion to identify the coherent large-scale structures visible in the galaxy distribution.
We use a "friends-of-friends" (FOF) algorithm to identify interconnected regions of filled cells which we refer to as clusters.
In this algorithm any two adjacent filled cells are referred to as friends. Clusters are defined through the stipulation that any
friend of my friend is my friend. The distribution of 1s on the grid is very sparse, with only ~1% of the cells being filled.
Also, the filled cells are mostly isolated, and the clusters identified using FOF, which contain only a few cells each, do not
correspond to the large-scale coherent structures seen in the SDSS strips. It is necessary to smooth or coarse-grain the
galaxy distribution so that the large-scale structures may be objectively identified.
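
On a grid, the FOF stipulation amounts to labelling connected components of filled cells. The sketch below is one possible implementation; it assumes that "adjacent" means sharing an edge (4-connectivity), which the text does not specify, and the name `label_clusters` is illustrative.

```python
from collections import deque

import numpy as np


def label_clusters(grid):
    """Label clusters of filled cells (value 1); returns (labels, number of clusters).

    Two filled cells are friends if they share an edge (an assumption), and
    friendship is propagated with a breadth-first search.
    """
    filled = np.asarray(grid) == 1
    labels = np.zeros(filled.shape, dtype=int)
    ny, nx = filled.shape
    n_clusters = 0
    for i in range(ny):
        for j in range(nx):
            if not filled[i, j] or labels[i, j]:
                continue
            n_clusters += 1
            labels[i, j] = n_clusters
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for p, q in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                    if 0 <= p < ny and 0 <= q < nx and filled[p, q] and not labels[p, q]:
                        labels[p, q] = n_clusters
                        queue.append((p, q))
    return labels, n_clusters


toy = np.array([[1, 1, 0, 0],
                [0, 0, 0, 1],
                [1, 0, -1, 1]])
labels, n = label_clusters(toy)
print(n)
print(labels)
```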

The coarse-graining is carried out by gradually making each filled cell fatter until the filled cells overlap and they finally
fill up the whole survey area. In every iteration of coarse-graining we fill up all the empty cells which are adjacent to filled
cells, causing every filled cell to grow fatter. This causes clusters to grow, first because of the growth of filled cells, and then by
the merger of adjacent clusters as they overlap. This process is illustrated in Figure 3. At the initial stages of coarse-graining,
the patterns which emerge from the distribution of 1s and 0s closely resemble the coherent structures seen in the galaxy
distribution. As the coarse-graining proceeds, the clusters become very thick and fill up the entire region, washing away any
visibly distinct pattern. The filling factor FF, defined as the fraction of cells within the survey area that are filled, i.e.

$$\mathrm{FF} = \frac{\text{Total No. of Filled Cells}}{\text{Total No. of Cells Inside the Survey Area}} \qquad (1)$$

increases from FF ~ 0.01 to FF = 1 as the coarse-graining proceeds. So as to not restrict our analysis to an arbitrarily chosen
level of smoothing, we analyze the clusters identified in the pattern of 1s and 0s after each iteration of coarse-graining, i.e. over
the whole range of FF.
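
One iteration of coarse-graining and the filling factor of equation (1) can be sketched as follows. As before, cells are taken to be adjacent when they share an edge, which is an assumption, and cells outside the survey (negative values) are never filled; the function names are illustrative.

```python
import numpy as np


def coarse_grain(grid):
    """Fill every empty cell (0) that shares an edge with a filled cell (1)."""
    filled = grid == 1
    grown = filled.copy()
    grown[1:, :] |= filled[:-1, :]    # neighbour above
    grown[:-1, :] |= filled[1:, :]    # neighbour below
    grown[:, 1:] |= filled[:, :-1]    # neighbour to the left
    grown[:, :-1] |= filled[:, 1:]    # neighbour to the right
    out = grid.copy()
    out[grown & (grid == 0)] = 1      # cells outside the survey stay negative
    return out


def filling_factor(grid):
    """Fraction of the cells inside the survey area that are filled."""
    return np.count_nonzero(grid == 1) / np.count_nonzero(grid >= 0)


# Toy example: one filled cell on a 5 x 5 grid with one corner outside the survey.
g = np.zeros((5, 5), dtype=np.int8)
g[2, 2] = 1
g[0, 0] = -1
while filling_factor(g) < 1.0:
    g = coarse_grain(g)
    print(filling_factor(g))
```

In an actual analysis the clusters would be identified and measured after every call to `coarse_grain`, so that each statistic is obtained as a function of FF.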
Figure 3. This shows the NGP and SGP strips at different levels of coarse-graining indicated by N in the figure along with the
corresponding filling factor FF. Only the large clusters are shown, different shades (colours) being used to demarcate individual clusters.



2.2.1 LCS and Shapefinders 

At each level of coarse-graining we identify the largest cluster and calculate the Largest Cluster Statistic (LCS), defined as
the fraction of the filled cells in the largest cluster

$$\mathrm{LCS} = \frac{\text{No. of Filled Cells in the Largest Cluster}}{\text{Total No. of Filled Cells}} \qquad (2)$$

We study the growth of LCS with increasing FF (Figures 5 and 6) to quantify the tendency of clusters to get interconnected
as the coarse-graining proceeds (Figure 3). The transition from many small clusters to a network of interconnected filaments
running across almost the entire survey, which is the onset of percolation, is manifested by a sharp increase in LCS.
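
A minimal sketch of the LCS of equation (2) is given below. It uses a simple flood fill with edge-sharing adjacency (an assumption, as before) to obtain the cluster sizes; `cluster_sizes` and `largest_cluster_statistic` are illustrative names.

```python
import numpy as np


def cluster_sizes(grid):
    """Sizes (in cells) of the edge-connected clusters of filled cells (value 1)."""
    filled = np.asarray(grid) == 1
    seen = np.zeros(filled.shape, dtype=bool)
    ny, nx = filled.shape
    sizes = []
    for i, j in zip(*np.nonzero(filled)):
        if seen[i, j]:
            continue
        seen[i, j] = True
        stack, size = [(i, j)], 0
        while stack:
            a, b = stack.pop()
            size += 1
            for p, q in ((a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)):
                if 0 <= p < ny and 0 <= q < nx and filled[p, q] and not seen[p, q]:
                    seen[p, q] = True
                    stack.append((p, q))
        sizes.append(size)
    return sizes


def largest_cluster_statistic(grid):
    """Fraction of all filled cells that belong to the largest cluster."""
    sizes = cluster_sizes(grid)
    return max(sizes) / sum(sizes) if sizes else 0.0


toy = np.array([[1, 1, 1, 0],
                [0, 0, 0, 0],
                [1, 0, -1, 1]])
print(largest_cluster_statistic(toy))
```

In this toy grid the clusters contain 3, 1 and 1 cells, so the largest cluster holds three of the five filled cells.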

The geometry and topology of a two dimensional cluster can be described by the three Minkowski functionals, namely its
area S, perimeter P, and genus G. It is possible to quantify the shape of the cluster using a single 2D "Shapefinder" statistic
(Bharadwaj et al. 2000) which is defined as the dimensionless ratio

$$\mathcal{F} = \frac{P^2 - 4\pi S}{P^2 + 4\pi S} \qquad (3)$$

which by construction has values in the range $0 \le \mathcal{F} \le 1$. It can be verified that $\mathcal{F} = 1$ for an ideal filament which has a
finite length and zero width, whereby it subtends no area ($S = 0$) but has a finite perimeter ($P > 0$). It can be further checked
that $\mathcal{F} = 0$ for a circular disk, and intermediate values of $\mathcal{F}$ quantify the degree of filamentarity, with the value increasing
as a cluster is deformed from a circular disk to a thin filament.

Figure 4. This figure exhibits how Shuffle works. The left panel shows the galaxy distribution in the NGP subsample and the right
panel shows a shuffled realization generated from the same data using $L = 90\,h^{-1}$ Mpc. A $90\,h^{-1}$ Mpc $\times$ $90\,h^{-1}$ Mpc grid was
placed on the NGP data, and square blocks lying fully inside the survey region were randomly shuffled around to generate the data
shown in the right panel. This process destroys all coherent structures spanning length-scales larger than $L = 90\,h^{-1}$ Mpc in the
actual data, and filaments larger than this in the shuffled data arise from chance alignments. Among the filaments seen in the right
panel, those which run across the block boundaries have formed purely from chance alignments.

The definition of $\mathcal{F}$ needs to be modified when working on a rectangular grid of spacing $l$. An ideal filament, represented
on a grid, has the minimum possible width, i.e. $l$, and its perimeter P and area S are related as $P = 2S/l + 2l$. At the other
extreme we have $P^2 = 16S$ for a square shaped cluster on the grid. We introduce the 2D Shapefinder statistic

$$\mathcal{F} = \frac{P^2 - 16S}{(P - 4l)^2} \qquad (4)$$

to quantify the shape of clusters on a grid. By definition $0 \le \mathcal{F} \le 1$. $\mathcal{F}$ quantifies the degree of filamentarity of the cluster,
with $\mathcal{F} = 1$ indicating a filament and $\mathcal{F} = 0$ a square, and $\mathcal{F}$ changes from 0 to 1 as a square is deformed to a filament.
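
The grid version of the Shapefinder can be checked on simple shapes. In the sketch below the perimeter is taken to be the number of exposed cell edges times the grid spacing, which is an assumption about how P is measured; a single isolated cell gives 0/0 in equation (4) and is assigned the value 0 here purely by convention. The function names are illustrative.

```python
import numpy as np


def area_and_perimeter(mask, cell=1.0):
    """Area and perimeter of a cluster given as a boolean mask on the grid."""
    mask = np.asarray(mask, dtype=bool)
    padded = np.pad(mask, 1, constant_values=False)
    core = padded[1:-1, 1:-1]
    # Count cell edges that separate a cluster cell from a non-cluster cell.
    neighbours = (padded[:-2, 1:-1], padded[2:, 1:-1], padded[1:-1, :-2], padded[1:-1, 2:])
    exposed = sum(int(np.count_nonzero(core & ~nb)) for nb in neighbours)
    return int(mask.sum()) * cell ** 2, exposed * cell


def filamentarity(mask, cell=1.0):
    """2D Shapefinder on a grid: (P^2 - 16 S) / (P - 4 l)^2."""
    area, perimeter = area_and_perimeter(mask, cell)
    denom = (perimeter - 4.0 * cell) ** 2
    if denom == 0.0:
        return 0.0  # single cell: 0/0, set to 0 by convention (assumption)
    return (perimeter ** 2 - 16.0 * area) / denom


print(filamentarity(np.ones((1, 10))))  # one-cell-wide filament
print(filamentarity(np.ones((4, 4))))   # square
print(filamentarity(np.ones((2, 8))))   # intermediate rectangle
```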

The extent of filamentarity in a survey is mostly dominated by the morphology of the most massive members, as a large
cluster has a greater contribution to the overall texture of large-scale structure than an individual galaxy. We therefore want
to give greater weight to larger objects and use the second moment of filamentarity as the indicator of average filamentarity.
The average filamentarity $F_2$ is defined as the mean filamentarity of all the clusters in a slice weighted by the square of the
area of each cluster

$$F_2 = \frac{\sum_i S_i^2\, \mathcal{F}_i}{\sum_i S_i^2} \qquad (5)$$

In the current analysis, we study the average filamentarity $F_2$ as a function of FF (Figures 5 and 6) to quantify the degree
of filamentarity in each of the SDSS subsamples and the random datasets.
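
The area-squared weighting can be written in a few lines. This is a minimal sketch in which the cluster areas and filamentarities are assumed to have been measured already; `average_filamentarity` is an illustrative name.

```python
def average_filamentarity(areas, filamentarities):
    """Mean filamentarity of the clusters, weighted by the square of their areas."""
    numerator = sum(s ** 2 * f for s, f in zip(areas, filamentarities))
    denominator = sum(s ** 2 for s in areas)
    return numerator / denominator if denominator else 0.0


# A large filament (area 10) and a single-cell cluster (area 1): the large
# cluster dominates the average because of the area-squared weighting.
print(average_filamentarity([10, 1], [1.0, 0.0]))
```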



2.2.2 Shuffle 

We use a statistical technique called Shuffle to determine the largest length-scale at which the filamentarity is statistically
significant. A grid with square blocks of side L is superposed on the original data slice (Figure 4). Blocks of data which lie
entirely within the slice are then randomly interchanged, with rotation, repeatedly, to form a new shuffled slice. The shuffling
process eliminates coherent features in the original data on scales larger than L, keeping clustering at scales below L nearly
identical to the original data. All the structures spanning length-scales greater than L that exist in the shuffled slices are
the result of chance alignments. At a fixed value of L, the average filamentarity in the original sample will be larger than in
the shuffled data only if the actual data has more filaments spanning length-scales larger than L than expected from
chance alignments. The largest value of L, $L_{\max}$, for which the average filamentarity of the shuffled slices is less than the
average filamentarity of the actual data gives us the largest length-scale at which the filamentarity is statistically significant.
Filaments spanning length-scales larger than $L_{\max}$ arise purely from chance alignments.
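
The block shuffling can be sketched as below. For simplicity the sketch acts on the gridded data (1 = filled, 0 = empty, negative = outside the survey) rather than on the galaxy positions, applies a single random permutation with a random rotation by a multiple of 90° to each block, and keeps the grid origin fixed; these are simplifying assumptions, and `shuffle_blocks` is an illustrative name.

```python
import numpy as np


def shuffle_blocks(grid, block, seed=None):
    """Randomly permute and rotate the square blocks lying fully inside the survey."""
    rng = np.random.default_rng(seed)
    ny, nx = grid.shape
    # Top-left corners of the blocks that contain no cell outside the survey.
    corners = [(i, j)
               for i in range(0, ny - block + 1, block)
               for j in range(0, nx - block + 1, block)
               if np.all(grid[i:i + block, j:j + block] >= 0)]
    out = grid.copy()
    order = rng.permutation(len(corners))
    for (ti, tj), src in zip(corners, order):
        si, sj = corners[src]
        tile = grid[si:si + block, sj:sj + block]
        out[ti:ti + block, tj:tj + block] = np.rot90(tile, k=int(rng.integers(4)))
    return out


rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(6, 6)).astype(np.int8)
data[0, 5] = -1                      # one block touches the survey boundary
shuffled = shuffle_blocks(data, 3, seed=2)
print(shuffled)
```

Blocks that touch the survey boundary are left in place, so the number of filled cells and the survey mask are both preserved.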

We have used the uniform thickness subsamples (Table I), which have thickness $8.2\,h^{-1}$ Mpc, for the Shuffle analysis. For
each value of L we generated 24 different realizations of the shuffled slices. To ensure that the edges of the blocks which are
shuffled around do not cut the actual filamentary pattern at exactly the same place in all the realizations of the shuffled
data, we randomly shifted the origin of the grid used to define the blocks. The values of FF and $F_2$ in the 24 realizations
differ from one another and from the actual data at the same stage of coarse-graining. So as to be able to quantitatively
compare the shuffled realizations with the actual data, we interpolate the values of $F_2$ in the shuffled realizations at the values
of FF obtained for the actual data. The mean $\bar{F}_2[\mathrm{Shuffled}]$ and the variance $(\Delta F_2[\mathrm{Shuffled}])^2$ of the average filamentarity
were determined for the shuffled data at each value of FF using the 24 realizations. The difference between the filamentarity of
the shuffled data and the actual data was quantified using the reduced $\chi^2$ per degree of freedom

$$\frac{\chi^2}{\nu} = \frac{1}{N_p} \sum_{a=1}^{N_p} \frac{\left(F_2[\mathrm{Actual}] - \bar{F}_2[\mathrm{Shuffled}]\right)_a^2}{\left(\Delta F_2[\mathrm{Shuffled}]\right)_a^2} \qquad (6)$$

where the sum is over the $N_p$ different values of the filling factor FF.
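
The interpolation and the statistic of equation (6) can be sketched as follows. The sketch assumes linear interpolation in FF and a sample variance with `ddof=1`, neither of which is specified in the text; `reduced_chi2` is an illustrative name.

```python
import numpy as np


def reduced_chi2(ff_actual, f2_actual, shuffled_runs):
    """Reduced chi-square between the actual F2(FF) curve and shuffled realizations.

    shuffled_runs is a sequence of (ff, f2) pairs of arrays, one per realization,
    with ff increasing. Each realization is interpolated to the FF values of the
    actual data before the mean and variance are taken.
    """
    ff_actual = np.asarray(ff_actual, dtype=float)
    f2_actual = np.asarray(f2_actual, dtype=float)
    interpolated = np.array([np.interp(ff_actual, ff, f2) for ff, f2 in shuffled_runs])
    mean = interpolated.mean(axis=0)
    variance = interpolated.var(axis=0, ddof=1)   # sample variance (assumption)
    return float(np.mean((f2_actual - mean) ** 2 / variance))


# Toy example with two shuffled realizations sampled at different FF values.
ff_run = np.array([0.1, 0.3, 0.5])
runs = [(ff_run, ff_run + 0.1), (ff_run, ff_run + 0.3)]
print(reduced_chi2([0.2, 0.4], [0.5, 0.7], runs))
```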



3 RESULTS 

3.1 Comparison with random samples 

Figure 5 shows the results for the NGP and SGP strips plotted along with the values for their random counterparts. For
both NGP and SGP we have 9 random realizations which were analyzed in exactly the same way as the actual data. The
values of the filling factor FF differ from realization to realization at the same level of coarse-graining, and we interpolated
the values of LCS and $F_2$ at the values of FF obtained for the actual data, and these were used to obtain the mean and the
$1-\sigma$ error-bars shown for the random data in the figure. It may be noted that the values of LCS and $F_2$ show very little
variation from realization to realization, and this is reflected in the very small error-bars. This is a consequence of the high
number density and large area of the two SDSS strips analyzed here.

For both the actual data and the random realizations, the LCS has a very small value at low values of FF. For the actual
data, LCS is below 0.2 up to a filling factor of 0.4, i.e. the largest cluster contains less than 20% of the filled cells when around
40% of the cells in the survey area are filled. A transition is seen to occur at FF in the range 0.5 to 0.6 for both the NGP
and SGP strips, and the largest cluster contains more than 80% of all the filled cells at a filling factor FF = 0.65. This sharp
transition from a set of small clusters at FF < 0.4 to a single large cluster, which is a network of filaments running across
almost the entire survey, is referred to as the percolation transition, and it is found to occur at a threshold value of the filling
factor around FF = 0.5. The LCS of the random data is less than that of the actual data for nearly the entire range of FF,
the two curves being many factors of $\sigma$ apart. Also, in the random data LCS < 0.2 all the way to FF = 0.5, after which it
exhibits a sudden rise to LCS = 0.7 at FF = 0.65. The faster growth of LCS with increasing FF in the two SDSS strips as
compared to the random Poisson point distribution, and the onset of percolation at a lower value of FF in the actual data,
show the presence of a network-like topology in the galaxy distribution.

Turning our attention next to the average filamentarity $F_2$, we find that initially $F_2$ is larger in the actual data as compared
to the random data. For both the actual and the random data $F_2$ increases with successive iterations of coarse-graining and
reaches a value $F_2 \sim 1$ at FF = 0.6. This corresponds to a situation where nearly all the clusters have merged into a single
cluster (LCS ~ 0.8) and further smoothing results in only fattening this cluster, causing the average filamentarity to fall. The
region beyond FF = 0.6 is not of importance and is not considered in our analysis. The average filamentarity of the actual
data is larger than that for a Poisson point distribution for the entire range of filling factor FF < 0.6. This shows that the two
SDSS strips analyzed here are largely dominated by filaments, significantly in excess of that expected in a random distribution
of points.



3.2 Luminosity Dependence 

The bright and faint subsamples were separately analyzed to investigate the luminosity dependence, if any, of the connectivity
and the shapes of the patterns in the galaxy distribution. For both the SGP and the NGP the subsample of faint galaxies
(Table I) contains roughly 1.5 to 2 times the number of galaxies in the subsample of bright galaxies distributed over the
same region. So as to compare the bright and faint subsamples at the same galaxy number density, we randomly extracted a
subset of the faint subsamples so that the number exactly matches the bright subsamples. Five such randomly chosen subsets
were used to make bootstrap estimates of the mean and the $1-\sigma$ fluctuations of LCS and $F_2$ as a function of FF (Figure 6).
These were compared with LCS and $F_2$ for the bright subsample to test if there is any statistically significant evidence for a
luminosity dependence.
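
The number-matched resampling can be sketched as follows. The sketch assumes that each subset is drawn without replacement and that the scatter is estimated with the sample standard deviation over the subsets; the names `matched_subsets` and `mean_and_sigma` are illustrative.

```python
import numpy as np


def matched_subsets(faint_xy, n_bright, n_subsets=5, seed=None):
    """Draw random subsets of the faint galaxies, each with n_bright members."""
    rng = np.random.default_rng(seed)
    faint_xy = np.asarray(faint_xy)
    return [faint_xy[rng.choice(len(faint_xy), size=n_bright, replace=False)]
            for _ in range(n_subsets)]


def mean_and_sigma(curves):
    """Mean and 1-sigma scatter of a statistic over the subsets at each FF value."""
    curves = np.asarray(curves, dtype=float)
    return curves.mean(axis=0), curves.std(axis=0, ddof=1)


# Placeholder positions standing in for the 2221 faint NGP galaxies of Table I.
faint = np.arange(2221 * 2, dtype=float).reshape(2221, 2)
subsets = matched_subsets(faint, 1076, seed=0)
print(len(subsets), subsets[0].shape)
```

Each subset would then be gridded, coarse-grained and measured in the same way as the bright subsample before `mean_and_sigma` is applied to the resulting curves.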

We find that the values of both LCS and $F_2$ are larger for the faint subsamples as compared to the bright ones. The
percolation transition occurs at a smaller value of FF for the faint subsamples, indicating that a network-like topology is
more dominant in the distribution of faint galaxies as compared to the bright ones. Also, the faint galaxies have a more
filamentary distribution, quantified by $F_2$, as compared to the bright galaxies. Using the reduced $\chi^2$ per degree of freedom to
assess the statistical significance of these differences, we find that $\chi^2/\nu = (72, 39)$ in NGP and SGP respectively for the
Largest Cluster Statistics and $(16, 97)$ for the Average Filamentarity. This establishes that the luminosity dependence is a
statistically significant effect.

Figure 5. This shows the Largest Cluster Statistics (LCS) and the Average Filamentarity ($F_2$) for two of the SDSS strips together with
the values for their random counterparts, plotted against the filling factor. We have used 9 realizations to determine the mean values
and the $1-\sigma$ error-bars shown for the random data.

Figure 6. The average filamentarity and largest cluster statistics vs. filling factor for bright and faint galaxies in the volume limited
samples. The $1-\sigma$ error-bars for faint galaxies are shown in the figure.



3.3 Statistical Significance of the Filaments 

We applied Shuffle to the NGP and the SGP uniform thickness subsamples, varying L from 20 to $150\,h^{-1}$ Mpc in steps of
$10\,h^{-1}$ Mpc. The Average Filamentarity falls substantially if the data is Shuffled using $L = 20\,h^{-1}$ Mpc (Figure 7), indicating
that a large fraction of the filaments are cut by the Shuffling mechanism and that the filaments produced by
chance alignments in the Shuffled data are fewer than the filaments destroyed. This establishes that the filaments
do not arise from chance alignments and are statistically significant, genuine features of the galaxy distribution at this
length-scale.

Figure 7. This shows the Average Filamentarity, as a function of filling factor, for the two SDSS slices together with the results for
the shuffled data for three values of L shown in the figure. Shuffling with $L = 20\,h^{-1}$ Mpc causes a large drop in the Average
Filamentarity, showing the statistical significance of the filamentarity at this length-scale. The filamentarity is statistically
significant up to $L_{\max} = 80\,h^{-1}$ Mpc, where the actual data lies above the $1-\sigma$ error-bars. The data is within the $1-\sigma$ error-bars
of the shuffled realizations for larger values of L, indicating that the filaments are not statistically significant beyond $L_{\max}$.
The data points for $L = 80\,h^{-1}$ Mpc have been slightly shifted to prevent the error-bars from overlapping in the figure.

Table II

L (h^-1 Mpc)   chi^2/nu (NGP)   chi^2/nu (SGP)
20             16.64            15.1
30             11.7             5.2
40             5.8              4.53
50             5.3              1.64
60             3.43             2.85
70             2.2              1.8
80             2.33             1.65
90             1.08             0.82
100            0.58             0.46
110            1.03             0.66
120            1.13             0.33
130            0.5              0.5
140            0.5              0.4
150            0.4              0.3

Longer filaments survive the Shuffling process as $L$ is increased, and hence the Average Filamentarity increases with $L$,
slowly approaching the values for the actual data. The values of $\chi^2/\nu$ given in Table II, and shown graphically in the
accompanying figure, quantify the difference in $F_2$ between the Shuffled and the actual data. We find that Shuffling the data
with $L = 90\,h^{-1}$ Mpc or larger does not result in a statistically significant drop in $F_2$, the value of $\chi^2/\nu$
being $\sim 1$ for $L \geq 90\,h^{-1}$ Mpc. The value of $L_{\rm max}$, the largest length-scale at which Shuffling causes $F_2$
to fall, is $80\,h^{-1}$ Mpc for both NGP and SGP. This is the largest length-scale at which the filaments are statistically
significant.
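
As a rough illustration of how such a $\chi^2/\nu$ could be formed, the sketch below compares the Average Filamentarity of the actual data with the mean and 1-$\sigma$ scatter of the Shuffled realizations at each Filling Factor. The exact definition is not given in this section, so the form used here (a sum of squared, error-normalized differences divided by the number of Filling Factor bins) is an assumption, and the function and argument names are hypothetical.

```python
def reduced_chi_square(f2_data, f2_shuffled_mean, f2_shuffled_sigma):
    """Reduced chi-square between the actual F2 and the Shuffled F2.

    Each argument is a sequence with one entry per Filling Factor bin.
    Assumption: nu is taken to be the number of bins.
    """
    terms = [
        ((d - m) / s) ** 2
        for d, m, s in zip(f2_data, f2_shuffled_mean, f2_shuffled_sigma)
    ]
    return sum(terms) / len(terms)


# Toy numbers for illustration only; these are not values from the survey.
toy_value = reduced_chi_square([1.0, 2.0], [0.0, 0.0], [1.0, 1.0])
```

A value near 1 then means that the actual data lies within the scatter of the Shuffled realizations, while a value well above 1 signals a significant drop in $F_2$ on Shuffling.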

Cutting the galaxy distribution into blocks of size $90\,h^{-1}$ Mpc or larger and shuffling them around does not reduce the
Average Filamentarity. Filaments spanning length-scales larger than $90\,h^{-1}$ Mpc are present in equal abundance in the
Shuffled and the actual data, showing that these filaments arise purely from chance alignments.
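
The block-shuffling idea can be sketched in a few lines. This is a minimal illustration rather than the procedure actually used for the survey strips: it assumes a square 2D region that is tiled exactly by blocks of side `block`, interchanges the blocks with a random permutation, and rotates each block by a random multiple of 90 degrees (the rotation step is an assumption here). All names are hypothetical.

```python
import random


def shuffle_blocks(points, size, block, seed=None):
    """Shuffle square blocks of side `block` inside a size x size region.

    points: iterable of (x, y) with 0 <= x, y < size.
    Assumes `size` is an integer multiple of `block`.
    """
    rng = random.Random(seed)
    n = int(round(size / block))               # number of blocks per side
    cells = [(i, j) for i in range(n) for j in range(n)]
    targets = cells[:]
    rng.shuffle(targets)                       # random permutation of block positions
    dest = dict(zip(cells, targets))
    turns = {c: rng.randrange(4) for c in cells}   # quarter-turns applied to each block

    shuffled = []
    for x, y in points:
        i = min(int(x // block), n - 1)
        j = min(int(y // block), n - 1)
        # Position relative to the centre of the block the point belongs to.
        u = x - (i + 0.5) * block
        v = y - (j + 0.5) * block
        for _ in range(turns[(i, j)]):
            u, v = -v, u                       # rotate by 90 degrees about the centre
        ti, tj = dest[(i, j)]
        shuffled.append(((ti + 0.5) * block + u, (tj + 0.5) * block + v))
    return shuffled
```

Structure inside each block is preserved exactly, so any coherent feature longer than `block` that appears in the output can only come from chance alignments across block boundaries.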

The Average Filamentarity for $L = 80\,h^{-1}$ Mpc and $90\,h^{-1}$ Mpc, where we have the transition from statistically
significant filaments to filaments that arise purely from chance, is shown in Figure 7 along with the actual data. We have
restricted our analysis to 0.2 < FF < 0.6. For FF < 0.2 there are many small clusters which bear no resemblance to the filaments
we wish to characterize, and for FF > 0.6 nearly all the filaments are interconnected into a single dominant cluster.
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4 DISCUSSION AND CONCLUSION 
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features at a high level of precisions on hitherto unprecedented length-scales. We have used the Largest Cluster Statistics 
(LCS) and the Average Filamentarity ($F_2$) to study, respectively, the interconnectivity and the shapes of the patterns seen
in the galaxy distribution in 2D projections of volume limited subsamples from the two equatorial strips in the SDSS DR1.
These were studied as functions of the Filling Factor (FF) at different levels of smoothing, and the results were compared to
a 2D Poisson point distribution. We find that at the same value of FF, the values of LCS and $F_2$ in both the NGP and the
SGP strips are substantially larger than those of the random distributions for nearly the entire range of FF. This indicates a
high level of connectivity consistent with a network-like topology, with filaments being the dominant structures in the galaxy
distribution. Individual filamentary structures identified at low filling factors (FF ~ 0.3) get interconnected into a network
at the percolation transition (FF ~ 0.5-0.6).
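
To make the three statistics concrete, the sketch below evaluates FF, LCS and $F_2$ on a small grid of filled (1) and empty (0) cells. The defining equations are not part of this section, so the forms used are assumptions based on the usual grid version of the 2D Shapefinder analysis: clusters are filled cells joined along edges, LCS is the fraction of filled cells in the largest cluster, the filamentarity of a cluster of area $S$ and perimeter $P$ on a grid of unit spacing is $(P^2 - 16S)/(P - 4)^2$, and $F_2$ is the $S^2$-weighted mean of the cluster filamentarities. Function names are hypothetical.

```python
from collections import deque

NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def find_clusters(grid):
    """Group filled cells that share an edge into clusters (breadth-first search)."""
    rows, cols = len(grid), len(grid[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or (r, c) in seen:
                continue
            seen.add((r, c))
            queue = deque([(r, c)])
            cells = []
            while queue:
                y, x = queue.popleft()
                cells.append((y, x))
                for dy, dx in NEIGHBOURS:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and grid[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            clusters.append(cells)
    return clusters


def perimeter(cells):
    """Number of unit edges of a cluster that face an empty or outside cell."""
    members = set(cells)
    return sum((y + dy, x + dx) not in members
               for y, x in cells for dy, dx in NEIGHBOURS)


def grid_statistics(grid):
    """Return (FF, LCS, F2) for a grid of 0/1 cells with at least one filled cell."""
    filled = sum(map(sum, grid))
    total = len(grid) * len(grid[0])
    clusters = find_clusters(grid)
    lcs = max(len(c) for c in clusters) / filled
    num = den = 0.0
    for cells in clusters:
        s, p = len(cells), perimeter(cells)
        # A single cell has P = 4, so the ratio is undefined; treat it as F = 0.
        f = 0.0 if p == 4 else (p * p - 16 * s) / (p - 4) ** 2
        num += s * s * f
        den += s * s
    return filled / total, lcs, num / den
```

With these definitions a square block of cells has zero filamentarity and a straight line of cells one cell wide has filamentarity 1, which is the behaviour the statistic is meant to capture.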

The high number density of galaxies in the SDSS allows us to test whether there is evidence for luminosity dependence in
the connectivity and the filamentarity. We find that the distribution of the brighter galaxies exhibits lower connectivity and
filamentarity than that of the fainter galaxies in the same region. This is consistent with a picture in which the brighter
galaxies preferentially reside in compact high density regions whereas the fainter galaxies have a more diffuse distribution.
Studies using N-body simulations (Bharadwaj & Pandey 2004) have shown that the filamentarity is highly sensitive to bias,
with the large-scale filamentarity falling with increasing bias. The findings of this paper reaffirm that the brighter galaxies
have a higher bias relative to the fainter ones, as noted in earlier studies of the luminosity dependence of galaxy clustering
(e.g. Norberg et al. 2001; Zehavi et al. 2002). Einasto et al. (2003) have studied the luminosity distribution of galaxies in
high and low density regions of the SDSS and show that the brighter galaxies are preferentially distributed in high density
environments. Goto et al. (2003) have studied the morphology-density relation in the SDSS (EDR) and find that this relation
is less noticeable in the sparsest regions, indicating that a denser environment is required for the physical mechanisms
responsible for galaxy morphological change. Studies of the connectivity using the genus statistics (Hoyle et al. 2002) have
revealed a colour dependence, with the redder galaxies showing lower connectivity than the blue ones. Hogg et al. (2003)
and Blanton et al. (2003b) find a strong environment dependence for both the colour and the luminosity of the SDSS galaxies.
Their results indicate that the red galaxies are found preferentially in overdense regions relative to the blue galaxies. These
observations, showing the connectivity and filamentarity to depend on the luminosity and the colour of the galaxies, pose
interesting questions about both the models of galaxy formation and our understanding of the formation of the filament-void
network. The colour dependence of the filamentarity has not been studied here, and we propose to take this up in the future.

Studies of the genus statistics at large values of smoothing show the SDSS to be consistent with a Gaussian random field
(Hoyle et al. 2002). Our analysis of the connectivity, starting with the unsmoothed data and analyzing it at various levels of
smoothing, complements the earlier analysis and reveals the presence of strongly non-Gaussian features, namely the filaments.
It is interesting to note that these filaments are the natural outcome of gravitational instability starting from Gaussian
random initial conditions.

We have determined the largest length-scale at which the filaments are statistically significant. We find that filaments
spanning length-scales up to $80\,h^{-1}$ Mpc are statistically significant in both the SDSS strips we have analyzed. Filaments
spanning scales larger than this are the outcome of purely chance alignments in the galaxy distribution. The analysis presented
here has distinct advantages over the earlier analysis using the LCRS (Bharadwaj et al. 2004). The SDSS strips are
substantially larger than the LCRS, and hence there are more blocks which can be shuffled around, giving us better statistics
for the results. Further, the LCRS wedges were curved and had varying thickness whereas our SDSS subsamples are flat and
have uniform thickness. Our results are consistent with the earlier findings for the LCRS, where filamentarity was found to
be statistically significant on scales up to 70-80 $h^{-1}$ Mpc in the $-3^\circ$ slice and 50-70 $h^{-1}$ Mpc in the other
5 slices. It is interesting to note that the analysis of mock SDSS catalogues based on $\Lambda$CDM N-body simulations
(Sheth 2004) reveals the length of the longest superclusters to be $\sim 60\,h^{-1}$ Mpc. A similar analysis of the
supercluster-void network in VIRGO $\Lambda$CDM N-body simulations shows the most massive superclusters to exceed
$50\,h^{-1}$ Mpc in length. It may be noted that our analysis gives an upper limit to the linear length-scale up to which the
filaments are statistically significant. The individual filaments may be wiggly or coiled up, and the length measured along
the filament may be larger.

The connectivity and filamentarity of the two SDSS strips are roughly consistent with each other, though we have not
performed a quantitative comparison given the different geometries of the two subsamples. It may be noted that the filaments
identified in our 2D analysis may actually be the intersection of 3D planar structures. Analysis of the SDSS power spectrum
(Tegmark et al. 2004a) shows the presence of a bump at the Fourier mode $k \sim 0.05\,h\,{\rm Mpc}^{-1}$. While it may be
speculated that the high level of filamentarity detected in the SDSS is a consequence of this feature, earlier analysis
using N-body simulations shows the filamentarity to be insensitive to the presence of such a bump (Bharadwaj & Pandey).

In conclusion, we note that our results confirm the earlier results (Bharadwaj et al. 2004) that the filaments seen in the
galaxy distribution are the largest known statistically significant structures in the universe.
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